NSF PAR Search | NSF Public Access Repository


LiteNoC: Developing Low-Cost Network-on-Chips for Deep Neural Networks

Ho, Khoa; Biglari, Siamak; Garrigus, Justin; Zhao, Hui; Mohanty, Saraju (June 2025, ACM)

With the rapid advance in Deep Neural Networks (DNNs), the GPU’s role as a hardware accelerator has become increasingly important. Due to the GPU’s significant power consumption, developing high-performance and power-efficient GPU systems is a critical challenge. DNN applications need to move a large amount of data between memory and the processing cores, which consumes a great amount of Network-on-Chip (NoC) power. However, previously proposed lossless data compression schemes cannot achieve optimal performance and energy efficiency because they do not take advantage of the error resilience of DNNs. In this work, we propose an NoC architecture that can reduce power consumption without compromising performance and accuracy. Our technique takes advantage of the error resilience of DNNs as well as the data locality in the floating-point data representation of DNNs. Each data packet is reorganized by grouping data with similar bits, such as in the exponents, and redundant bits are sent only once. We further compress the mantissa fields by appropriately selecting "proxy" values for data sharing the same exponent. Our evaluation results show that the proposed technique can effectively reduce the amount of data transmitted and lead to better performance and power trade-offs while preserving accuracy.
Full Text Available
Concentration polarization induced electro-osmosis around a charged dielectric microchannel corner

https://doi.org/10.1103/PhysRevFluids.10.044203

Zhao, Hui; Xuan, Xiangchun; Wu, Ning (April 2025, Physical Review Fluids)

Full Text Available
Ultrafast Microdroplet Digestion of Antibodies with Fc-Silencing Mutations

https://doi.org/10.1021/acs.analchem.5c01856

Yang, Yongqing; Xiao, Mengyuan; Lau, Jim; Knierman, Mike; Zhao, Hui; Qiu, Xi; Luo, Karen; Sausen, John; Gunawardena, Harsha P; Chen, Hao (June 2025, Analytical Chemistry)

Full Text Available
Survey of Hardware Acceleration of Genomic Analysis

Liu, Zhuren; Zhang, Shouzhe; Zhao, Hui (December 2024, IEEE)

As the Next-Generation Sequencing (NGS) techniques need to process enormous amounts of data, cost-efficient and high-throughput computational analysis is essential in genomics study. Conventional computing platforms face great challenges to meet these demands due to their limited processing speed and scalability. Hardware accelerators, such as Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), and Application-Specific Integrated Circuits (ASICs), offer transformative solutions to these computational challenges. This paper provides a state-of-the-art review of the roles of hardware accelerators in genomic analysis. We performed a comprehensive and in-depth analysis of cutting-edge genomics hardware accelerators, such as GPUs, FPGAs, and ASICs, in the context of the specific algorithms they aim to enhance. Besides reviewing opportunities in hardware genome acceleration, we also provide insights into the challenges regarding processing speed, cost efficiency, and scalability.
Full Text Available
Long-Lived Charge-Transfer Excitons in a Graphene–PTCDI–TiOPc Trilayer Heterostructure

https://doi.org/10.1021/acs.jpcc.5c00969

Scott, Ryan J; Fuller, Neno; Sadanandan, Adithya; Valencia-Acuna, Pavel; Chan, Wai-Lun; Zhou, Qunfei; Zhao, Hui (April 2025, The Journal of Physical Chemistry C)

Excitation transfer across the interfaces between graphene, perylenetetracarboxylic diimide (PTCDI), and titanyl phthalocyanine (TiOPc) was studied by using transient absorption and photoluminescence spectroscopy. Both photoluminescence quenching and transient absorption measurements confirm the presence of a type-II interface between PTCDI and TiOPc. While the graphene/PTCDI interface is expected to exhibit type-I behavior, transient absorption measurements indicate that only electrons transfer from PTCDI to graphene, with no evidence of hole transfer. Density functional theory calculations reveal significant ground-state electron transfer from graphene to PTCDI, resulting in band bending that prevents excited holes from transferring from PTCDI to graphene. This feature is exploited in a trilayer heterostructure of graphene/PTCDI/TiOPc, where the spatial separation of photoexcited electrons and holes in graphene and TiOPc, respectively, leads to the formation of long-lived photoexcitations with a lifetime of approximately 500 ps. Furthermore, spatially resolved transient absorption measurements reveal the immobile nature of these excitations, confirming that they are charge-transfer excitons rather than free electrons and holes. These results provide valuable insights into the complex interlayer photoexcitation transfer properties and demonstrate precise control over the layer population and the recombination lifetime of photocarriers in such hybrid heterostructures.
Full Text Available
Type-I and type-II interfaces in a MoSe2/WS2 van der Waals heterostructure

https://doi.org/10.1063/5.0253709

Rafizadeh, Neema; Agunbiade, Gbenga; Scott, Ryan J; Vieux, Monique; Zhao, Hui (January 2025, Applied Physics Letters)

We report experimental evidence that MoSe2 and WS2 allow the formation of type-I and type-II interfaces, according to the thickness of the former. Heterostructure samples are obtained by stacking a monolayer WS2 flake on top of a MoSe2 flake that contains regions of thickness from one to four layers. Photoluminescence spectroscopy and transient absorption measurements reveal a type-II interface in the regions of monolayer MoSe2 in contact with monolayer WS2. In other regions of the heterostructure formed by multilayer MoSe2 and monolayer WS2, features of type-I interface are observed, including the absence of charge transfer and dominance of intralayer excitons in MoSe2. The coexistence of type-I and type-II interfaces in a single heterostructure offers opportunities to design sophisticated two-dimensional materials with finely controlled photocarrier behaviors.
Full Text Available
Survey of Network-on-Chip (NoC) for Heterogeneous Multicore Systems

Biglari, Siamak; Hosseini, Farahnaz; Upadhyay, Aadesh; Zhao, Hui (December 2024, IEEE)

In recent years, Network-on-Chip (NoC) has emerged as a promising solution for addressing a critical performance bottleneck encountered in designing large-scale multi-core systems, i.e., data communication. With advancements in chip manufacturing technologies and the increasing complexity of system designs, the task of designing the communication subsystems has become increasingly challenging. The emergence of hardware accelerators, such as GPUs, FPGAs and ASICs, together with heterogeneous system integration of the CPUs and the accelerators creates new challenges in NoC design. Conventional NoC architectures developed for CPU-based multi-core systems are not able to satisfy the traffic demands of heterogeneous systems. In recent years, numerous research efforts have been dedicated to exploring the various aspects of NoC design in hardware accelerators and heterogeneous systems. However, there is a need for a comprehensive understanding of the current state-of-the-art research in this emerging research area. This paper aims to provide a summary of research work conducted in heterogeneous NoC design. Through this survey, we aim to present a comprehensive overview of the current related research, highlighting key findings, challenges, and future directions in this field.
Full Text Available
Using DIC‐δ¹³C Pair to Constrain Anthropogenic Carbon Increase in the Southeastern Atlantic Ocean Over the Most Recent Decade (2010–2020)

https://doi.org/10.1029/2024JC021586

Gao, Hui; Jin, Meibing; Zhao, Hui; Hussain, Najid; Cai, Wei‐Jun (November 2024, Journal of Geophysical Research: Oceans)

The southeastern Atlantic Ocean is a crucial yet understudied region for the ocean absorption of anthropogenic carbon (C_anth). Data from the A12 (2020) and A13.5 (2010) cruises offer an opportunity to examine changes in dissolved inorganic carbon (DIC), its stable isotope (δ¹³C), and C_anth over the past decade within a limited region (1–3°E, 32–42°S). For the decade of 2010–2020, C_anth invasion was observed from the sea surface down to 1,200 m based on both DIC and δ¹³C data. The mean C_anth increase rate (1.08 ± 0.26 mol m⁻² yr⁻¹) during this period accelerated from 0.87 ± 0.05 mol m⁻² yr⁻¹ during the previous period (1983/84–2010). The δ¹³C‐based C_anth increase closely matches the DIC‐based estimation below 500 m but is 26% higher in the upper ocean. This discrepancy is likely due to δ¹³C's longer air‐sea exchange timescale, seasonal variability in the upper ocean, and the chosen ratio of anthropogenically induced changes in δ¹³C and DIC. Finally, column inventory changes based on the two methods also exhibit very similar mean C_anth uptake rates. The paired DIC concentration and stable isotope dataset may enhance our ability to constrain C_anth accumulation and its controlling mechanisms in the ocean.
Full Text Available
Designing Reconfigurable Interconnection Network of Heterogeneous Chiplets Using Kalman Filter

https://doi.org/10.1145/3649476.3660389

Biglari, Siamak; Huang, Ruixiao; Zhao, Hui; Mohanty, Saraju (June 2024, ACM)

Heterogeneous chiplets have been proposed for accelerating high-performance computing tasks. Integrated inside one package, CPU and GPU chiplets can share a common interconnection network that can be implemented through the interposer. However, CPU and GPU applications have very different traffic patterns in general. Without effective management of the network resources, some chiplets can suffer significant performance degradation because the network bandwidth is taken away by communication-intensive applications. Therefore, techniques need to be developed to effectively manage the shared network resources. In a chiplet-based system, resource management needs to not only react in real time but also be cost-efficient. In this work, we propose a reconfigurable network architecture that leverages a Kalman filter to make accurate predictions of the network resources needed by the applications and then adaptively changes the resource allocation. Using our design, the network bandwidth can be fairly allocated to avoid starvation or performance degradation. Our evaluation results show that the proposed reconfigurable interconnection network can dynamically react to the changes in traffic demand of the chiplets and improve the system performance with low cost and design complexity.
Full Text Available
Diversification of Pharmaceutical Manufacturing Processes: Taking the Plunge into the Non-PGM Catalyst Pool

https://doi.org/10.1021/acscatal.4c01809

Zhao, Hui; Ravn, Anne K; Haibach, Michael C; Engle, Keary M; Johansson Seechurn, Carin C C (June 2024, ACS Catalysis)

Full Text Available
